Message Passing in C-RAN: Joint User Activity and Signal Detection
In cloud radio access network (C-RAN), remote radio heads (RRHs) and users
are uniformly distributed in a large area such that the channel matrix can be
considered as sparse. Based on this phenomenon, RRHs only need to detect the
relatively strong signals from nearby users and ignore the weak signals from
far users, which is helpful to develop low-complexity detection algorithms
without causing much performance loss. However, before detection, RRHs need
to obtain real-time user activity information through the dynamic grant
procedure, which causes enormous latency. To address this issue, in this
paper, we consider a grant-free C-RAN system and propose a low-complexity
Bernoulli-Gaussian message passing (BGMP) algorithm based on the sparsified
channel, which jointly detects the user activity and signal. Since active users
are assumed to transmit Gaussian signals at any time, the user activity can be
regarded as a Bernoulli variable and the signals from all users obey a
Bernoulli-Gaussian distribution. In the BGMP, the detection functions for
signals are designed with respect to the Bernoulli-Gaussian variable. Numerical
results demonstrate the robustness and effectiveness of the BGMP. That is, for
different sparsified channels, the BGMP can approach the mean-square error
(MSE) of the genie-aided sparse minimum mean-square error (GA-SMMSE) which
exactly knows the user activity information. Meanwhile, the fast convergence
and strong recovery capability for user activity of the BGMP are also verified.
Comment: Conference, 6 pages, 7 figures, accepted by IEEE Globecom 201
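The Bernoulli-Gaussian signal model described in this abstract can be illustrated with a scalar example. The following is a minimal sketch, not the paper's BGMP algorithm: it assumes a real-valued observation y = x + n with Gaussian noise, and the names `bg_posterior`, `sample_bg`, `activity_prob`, `signal_var`, and `noise_var` are hypothetical. It computes the posterior activity probability and the posterior mean of a single Bernoulli-Gaussian variable, which is the kind of scalar estimate a message passing detector would evaluate at a variable node.

```python
import math
import random


def gaussian_pdf(y, var):
    """Density of a zero-mean real Gaussian with variance `var`, evaluated at y."""
    return math.exp(-y * y / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def sample_bg(n, activity_prob, signal_var, rng):
    """Draw n Bernoulli-Gaussian samples: Gaussian when active, exactly 0 when idle."""
    std = math.sqrt(signal_var)
    return [rng.gauss(0.0, std) if rng.random() < activity_prob else 0.0
            for _ in range(n)]


def bg_posterior(y, activity_prob, signal_var, noise_var):
    """Return (P(active | y), E[x | y]) for y = x + n with x Bernoulli-Gaussian.

    Assumption (not from the source): real-valued scalar channel, n ~ N(0, noise_var).
    """
    # If the user is active, y ~ N(0, signal_var + noise_var); if idle, y ~ N(0, noise_var).
    like_active = activity_prob * gaussian_pdf(y, signal_var + noise_var)
    like_idle = (1.0 - activity_prob) * gaussian_pdf(y, noise_var)
    p_active = like_active / (like_active + like_idle)
    # Conditioned on activity, the MMSE estimate is the usual linear shrinkage of y.
    shrink = signal_var / (signal_var + noise_var)
    return p_active, p_active * shrink * y


rng = random.Random(0)
signals = sample_bg(8, activity_prob=0.3, signal_var=1.0, rng=rng)
for x in signals:
    y = x + rng.gauss(0.0, math.sqrt(0.1))
    p, x_hat = bg_posterior(y, activity_prob=0.3, signal_var=1.0, noise_var=0.1)
    print(f"x={x:+.3f}  y={y:+.3f}  P(active|y)={p:.3f}  x_hat={x_hat:+.3f}")
```

Observations close to zero are pulled toward the idle hypothesis, while large observations are attributed to an active user and shrunk linearly, so one function yields both the activity decision and the signal estimate.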
Low-Complexity and Information-Theoretic Optimal Memory AMP for Coded Generalized MIMO
This paper considers a generalized multiple-input multiple-output (GMIMO)
with practical assumptions, such as massive antennas, practical channel coding,
arbitrary input distributions, and general right-unitarily-invariant channel
matrices (covering Rayleigh fading, certain ill-conditioned and correlated
channel matrices). Orthogonal/vector approximate message passing (OAMP/VAMP)
has been proved to be information-theoretically optimal in GMIMO, but it
suffers from high complexity. Meanwhile, low-complexity memory approximate
message passing (MAMP) was shown to be Bayes optimal in GMIMO, but channel
coding was ignored. Therefore, how to design a low-complexity and
information-theoretic optimal receiver for GMIMO is still an open issue. In
this paper, we propose an information-theoretic optimal MAMP receiver for coded
GMIMO, whose achievable rate analysis and optimal coding principle are provided
to demonstrate its information-theoretic optimality. Specifically, state
evolution (SE) for MAMP is intricately multi-dimensional because of the nature
of local memory detection. To this end, a fixed-point consistency lemma is
proposed to derive the simplified variational SE (VSE) for MAMP, based on which
the achievable rate of MAMP is calculated, and the optimal coding principle is
derived to maximize the achievable rate. Subsequently, we prove the
information-theoretic optimality of MAMP. Numerical results show that the
finite-length performances of MAMP with optimized LDPC codes are about 1.0 to
2.7 dB away from the associated constrained capacities. It is worth noting that
MAMP can achieve the same performance as OAMP/VAMP with 0.4% of the time
consumption for large-scale systems.
Comment: 6 pages, 6 figures, accepted at GLOBECOM 202
Capacity-Achieving MIMO-NOMA: Iterative LMMSE Detection
This paper considers a low-complexity iterative Linear Minimum Mean Square
Error (LMMSE) multi-user detector for the Multiple-Input and Multiple-Output
system with Non-Orthogonal Multiple Access (MIMO-NOMA), where multiple
single-antenna users simultaneously communicate with a multiple-antenna base
station (BS). Although LMMSE, being a linear detector, has low complexity, it
performs suboptimally in multi-user detection scenarios due to the mismatch
between LMMSE detection and multi-user decoding. Therefore, in this paper, we
provide the matching conditions between the detector and decoders for
MIMO-NOMA, which are then used to derive the achievable rate of the iterative
detection. We prove that a matched iterative LMMSE detector can achieve (i) the
optimal capacity of symmetric MIMO-NOMA with any number of users, (ii) the
optimal sum capacity of asymmetric MIMO-NOMA with any number of users, (iii)
all the maximal extreme points in the capacity region of asymmetric MIMO-NOMA
with any number of users, and (iv) all points in the capacity region of two-user
and three-user asymmetric MIMO-NOMA systems. In addition, a kind of practical
low-complexity error-correcting multiuser code, called irregular
repeat-accumulate code, is designed to match the LMMSE detector. Numerical
results show that the bit error rate performance of the proposed iterative
LMMSE detection outperforms state-of-the-art methods and is within 0.8 dB of
the associated capacity limit.
Comment: Accepted by IEEE TSP, 16 pages, 9 figures. This is the first work
that proves the low-complexity iterative receiver (Parallel Interference
Cancellation) can achieve the capacity of multi-user MIMO systems. arXiv
admin note: text overlap with arXiv:1604.0831
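As a rough illustration of the linear detection step this abstract builds on, here is a minimal sketch of a single LMMSE estimate for two users. It is not the paper's matched iterative receiver: it assumes real-valued, zero-mean, unit-power symbols and a known channel, and the function name `lmmse_detect_two_users` and the example values are hypothetical. The estimate is the standard x_hat = (H^T H + sigma^2 I)^{-1} H^T y, with the 2 x 2 inverse written out in closed form.

```python
def lmmse_detect_two_users(H, y, noise_var):
    """LMMSE estimate of x from y = H x + n for two unit-power real-valued users.

    H is a list of rows [h_i1, h_i2], one row per receive antenna.
    Assumptions (not from the source): real-valued model, known H, noise_var > 0.
    """
    # Regularized Gram matrix A = H^T H + noise_var * I (symmetric 2 x 2).
    a11 = sum(row[0] * row[0] for row in H) + noise_var
    a12 = sum(row[0] * row[1] for row in H)
    a22 = sum(row[1] * row[1] for row in H) + noise_var
    # Matched-filter output b = H^T y.
    b1 = sum(row[0] * yi for row, yi in zip(H, y))
    b2 = sum(row[1] * yi for row, yi in zip(H, y))
    # Solve A x_hat = b with the closed-form 2 x 2 inverse.
    det = a11 * a22 - a12 * a12
    return [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


# Three receive antennas, two single-antenna users, noiseless observation.
H = [[1.0, 0.5], [0.2, 1.0], [0.7, -0.3]]
x_true = [1.0, -1.0]
y = [row[0] * x_true[0] + row[1] * x_true[1] for row in H]
print(lmmse_detect_two_users(H, y, noise_var=0.1))
```

With a nonzero noise variance the estimate is biased toward zero, which is the expected shrinkage of an LMMSE detector. In an iterative receiver of the kind the abstract describes, the prior statistics of each user would be refreshed from decoder feedback between passes, a step this sketch leaves out.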
Bridging the Granularity Gap for Acoustic Modeling
While the Transformer has become the de facto standard for speech, modeling
fine-grained frame-level features remains an open challenge in capturing
long-distance dependencies and distributing the attention weights. We propose
Progressive Down-Sampling (PDS), which gradually compresses the
acoustic features into coarser-grained units containing more complete semantic
information, like text-level representation. In addition, we develop a
representation fusion method to alleviate information loss that occurs
inevitably during high compression. In this way, we compress the acoustic
features into 1/32 of the initial length while achieving better or comparable
performance on the speech recognition task. As a bonus, it yields
inference speedups ranging from 1.20× to 1.47×. By reducing the
modeling burden, we also achieve competitive results when training on the more
challenging speech translation task.
Comment: ACL 2023 Finding
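The idea of compressing a frame sequence in stages down to 1/32 of its length can be sketched as follows. This is an illustrative stand-in, not the paper's PDS module: it assumes five stages that each halve the length, uses plain mean pooling where a learned layer would normally be used, and omits the representation fusion step. The names `downsample_by_two` and `progressive_downsample` are hypothetical.

```python
def downsample_by_two(frames):
    """Halve the sequence length by averaging each pair of adjacent frames."""
    pooled = []
    for i in range(0, len(frames) - 1, 2):
        pooled.append([(a + b) / 2.0 for a, b in zip(frames[i], frames[i + 1])])
    if len(frames) % 2 == 1:
        pooled.append(list(frames[-1]))  # keep an unpaired trailing frame as-is
    return pooled


def progressive_downsample(frames, num_stages=5):
    """Apply `num_stages` halvings and return the sequence after every stage.

    Assumption (not from the source): five stride-2 stages give the 1/32 ratio.
    """
    stages = [frames]
    for _ in range(num_stages):
        stages.append(downsample_by_two(stages[-1]))
    return stages


# 64 one-dimensional "acoustic frames" with values 0.0, 1.0, ..., 63.0.
frames = [[float(t)] for t in range(64)]
stages = progressive_downsample(frames)
print([len(s) for s in stages])  # [64, 32, 16, 8, 4, 2]
```

Keeping the output of every stage, rather than only the last, mirrors the setting in which coarse and fine representations can later be combined to offset the information lost at high compression.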
NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches
Accurate dietary intake estimation is critical for informing policies and
programs to support healthy eating, as malnutrition has been directly linked to
decreased quality of life. However, self-reporting methods such as food diaries
suffer from substantial bias. Other conventional dietary assessment techniques
and emerging alternative approaches such as mobile applications incur high time
costs and may necessitate trained personnel. Recent work has focused on using
computer vision and machine learning to automatically estimate dietary intake
from food images, but the lack of comprehensive datasets with diverse
viewpoints, modalities and food annotations hinders the accuracy and realism of
such methods. To address this limitation, we introduce NutritionVerse-Synth,
the first large-scale dataset of 84,984 photorealistic synthetic 2D food images
with associated dietary information and multimodal annotations (including depth
images, instance masks, and semantic masks). Additionally, we collect a real
image dataset, NutritionVerse-Real, containing 889 images of 251 dishes to
evaluate realism. Leveraging these novel datasets, we develop and benchmark
NutritionVerse, an empirical study of various dietary intake estimation
approaches, including indirect segmentation-based and direct prediction
networks. We further fine-tune models pretrained on synthetic data with real
images to provide insights into the fusion of synthetic and real data. Finally,
we release both datasets (NutritionVerse-Synth, NutritionVerse-Real) on
https://www.kaggle.com/nutritionverse/datasets as part of an open initiative to
accelerate machine learning for dietary sensing.
Segment Anything Model for Medical Images?
The Segment Anything Model (SAM) is the first foundation model for general
image segmentation. It introduces a novel promptable segmentation task, enabling
zero-shot image segmentation using the pre-trained model via two main modes:
automatic everything and manual prompt. SAM has achieved impressive
results on various natural image segmentation tasks. However, medical image
segmentation (MIS) is more challenging due to the complex modalities, fine
anatomical structures, uncertain and complex object boundaries, and wide-range
object scales. Meanwhile, zero-shot and efficient MIS can reduce the
annotation time and boost the development of medical image analysis. Hence, SAM
seems to be a potential tool and its performance on large medical datasets
should be further validated. We collected and sorted 52 open-source datasets,
and built a large medical segmentation dataset with 16 modalities, 68 objects,
and 553K slices. We conducted a comprehensive analysis of different SAM testing
strategies on the so-called COSMOS 553K dataset. Extensive experiments validate
that SAM performs better with manual hints like points and boxes for object
perception in medical images, leading to better performance in prompt mode
compared to everything mode. Additionally, SAM shows remarkable performance in
some specific objects and modalities, but is imperfect or even totally fails in
other situations. Finally, we analyze the influence of different factors (e.g.,
the Fourier-based boundary complexity and size of the segmented objects) on
SAM's segmentation performance. Extensive experiments validate that SAM's
zero-shot segmentation capability is not sufficient to ensure its direct
application to MIS.
Comment: 23 pages, 14 figures, 12 table